An optimization framework for the capacity allocation and admission control of MapReduce jobs in cloud systems
Nowadays we live in a Big Data world, and many sectors of our economy are guided by data-driven decision processes. Big Data and Business Intelligence applications are facilitated by the MapReduce programming model, while, at the infrastructure layer, cloud computing offers flexible and cost-effective on-demand large clusters. Capacity allocation in such systems, i.e., the problem of provisioning computational power to support concurrent MapReduce applications in a cost-effective fashion, is a challenge of paramount importance. In this paper we lay the foundation for a solution implementing admission control and capacity allocation for MapReduce jobs with a priori deadline guarantees. In particular, we target shared Hadoop 2.x clusters supporting batch and/or interactive jobs. We formulate a linear programming model that minimizes cloud resource costs and rejection penalties for the execution of jobs belonging to multiple classes with deadline guarantees. Scalability analyses demonstrate that the proposed method determines the global optimum of the linear problem for systems with up to 10,000 classes in less than one second.
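The trade-off at the heart of this abstract, paying for enough containers to meet a class's deadline versus paying its rejection penalty, can be illustrated with a toy decision rule. This is a minimal sketch with hypothetical numbers, not the paper's model: the paper solves all classes jointly as a linear program, whereas here each class is decided independently, assuming tasks parallelize perfectly across containers.

```python
import math

# Toy admission-control / capacity-allocation decision (hypothetical numbers).
# The paper formulates this jointly as an LP; here each job class is decided
# independently as a minimal illustration of the cost trade-off.

def required_containers(n_tasks, task_time, deadline):
    """Containers needed to finish n_tasks of duration task_time by deadline,
    assuming perfect parallelism (a simplification)."""
    return math.ceil(n_tasks * task_time / deadline)

def admit_or_reject(classes, container_cost):
    """Admit a class if provisioning it is cheaper than its rejection penalty."""
    decisions = {}
    for name, (n_tasks, task_time, deadline, penalty) in classes.items():
        c = required_containers(n_tasks, task_time, deadline)
        run_cost = c * container_cost
        decisions[name] = ("admit", c) if run_cost <= penalty else ("reject", 0)
    return decisions

classes = {
    # tasks, task time (s), deadline (s), rejection penalty (cost units)
    "interactive": (100, 2.0, 50.0, 30.0),
    "batch":       (400, 5.0, 100.0, 10.0),
}
print(admit_or_reject(classes, container_cost=1.0))
# -> {'interactive': ('admit', 4), 'batch': ('reject', 0)}
```

In the full LP, the two decisions interact through shared cluster capacity, which is what makes the joint optimization worthwhile.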
Analytical composite performance models for Big Data applications
In the era of Big Data, in which the digital industry faces massive growth in data size and the development of data-intensive software, more and more companies are adopting new frameworks and paradigms capable of handling data at scale. The MapReduce (MR) paradigm and its implementation framework, Hadoop, are among the most widely used, and form the basis for later, more advanced frameworks such as Tez and Spark. Accurate prediction of the execution time of a Big Data application helps improve design-time decisions, reduces over-allocation charges, and assists budget management. In this regard, we propose analytical models based on Stochastic Activity Networks (SANs) to accurately model the execution of MR, Tez, and Spark applications in Hadoop environments governed by the YARN Capacity Scheduler. We evaluate the accuracy of the proposed models on the TPC-DS industry benchmark across different configurations. Results obtained by numerically solving the analytical SAN models show an average error of 6% in estimating the execution time of an application compared to data gathered from experiments; moreover, the model evaluation time is lower than the simulation time of state-of-the-art solutions.
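The flavor of analytical execution-time prediction can be conveyed by a much simpler wave-based estimate than the SAN models the abstract describes: with a fixed number of container slots, tasks run in "waves", and the job time is the sum of map waves and reduce waves. This sketch uses hypothetical task counts and durations and ignores shuffle overlap and scheduler contention, which are exactly what richer models like SANs capture.

```python
import math

def mr_time_estimate(n_map, t_map, n_red, t_red, slots_map, slots_red):
    """Wave-based MapReduce execution-time estimate (a simplification:
    assumes uniform task durations and no map/shuffle overlap)."""
    waves_map = math.ceil(n_map / slots_map)   # full scans of the map slot pool
    waves_red = math.ceil(n_red / slots_red)   # full scans of the reduce slot pool
    return waves_map * t_map + waves_red * t_red

# Hypothetical job: 128 map tasks of 12 s on 32 slots, 32 reduce tasks of 30 s on 8 slots
est = mr_time_estimate(128, 12.0, 32, 30.0, slots_map=32, slots_red=8)
print(est)  # -> 168.0  (4 map waves * 12 s + 4 reduce waves * 30 s)
```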
Medico-legal assessment of personal damage in older people: report from a multidisciplinary consensus conference
Ageing of the global population represents a challenge for national healthcare systems and healthcare professionals, including medico-legal experts, who assess personal damage in an increasing number of older people. Personal damage evaluation in older people is complex, and the scarcity of evidence is hindering the development of formal guidelines on the subject. The main objectives of the first multidisciplinary Consensus Conference on the Medico-Legal Assessment of Personal Damage in Older People were to increase knowledge on the subject and establish standard procedures in this field. The conference, organized according to the guidelines issued by the Italian National Institute of Health (ISS), was held in Bologna (Italy) on June 8, 2019, with the support of national scientific societies, professional organizations, and stakeholders. The Scientific Technical Committee prepared 16 questions on 4 thematic areas: (1) differences in injury outcomes between older and younger people and their relevance to personal damage assessment; (2) reconstruction and evaluation of pre-existing status; (3) medico-legal examination procedures; (4) multidimensional assessment and scales. The Scientific Secretariat reviewed the relevant literature and documents, rated their quality, and summarized the evidence. During the conference's plenary public sessions, 4 pairs of experts reported on each thematic area. After the last session, a multidisciplinary Jury Panel (15 members) drafted the consensus statements. The present report describes the Conference methods and results, including a summary of the evidence supporting each statement and the areas requiring further investigation. The methodological recommendations issued during the Conference may be useful in several contexts of damage assessment and in other medico-legal evaluation fields.
Performance Prediction of GPU-based Deep Learning Applications
Recent years have seen increasing success in the application of deep learning methods across various domains and problems, ranging from image recognition and classification to text processing and speech recognition. In this paper we propose and validate an approach to model the execution time for training convolutional neural networks (CNNs) deployed on GPGPUs. We demonstrate that our approach applies with high accuracy to a variety of CNN models and different types of GPGPUs, targeting system sizing during the preliminary design phases.
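One common building block for this kind of performance prediction is a regression from workload features to measured time. As a minimal sketch, not the paper's actual model, the example below fits a one-feature least-squares line mapping per-step GFLOPs to step time; all data points are hypothetical and lie on a clean line for illustration.

```python
def fit_linear(x, y):
    """Closed-form ordinary least squares for y ~ a*x + b (one feature)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    b = my - a * mx
    return a, b

# Hypothetical profiling data: per-step GFLOPs vs. measured step time (ms)
gflops = [1.2, 2.5, 4.8, 9.6]
ms     = [15.0, 28.0, 51.0, 99.0]

a, b = fit_linear(gflops, ms)
predict = lambda g: a * g + b
print(a, b)             # -> 10.0 3.0 (the data lie on ms = 10*GFLOPs + 3)
print(predict(3.0))     # -> 33.0
```

Real GPU performance models typically need more features (memory traffic, batch size, kernel launch overhead) and per-architecture calibration, but the fit/predict structure is the same.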
Fluid Petri nets for the performance evaluation of MapReduce applications
Big Data applications allow large amounts of not necessarily structured data to be analyzed successfully, though at the same time they present new challenges. For example, predicting the performance of frameworks such as Hadoop can be a costly task, hence the need for models that offer valuable support to designers and developers. This paper contributes a novel modeling approach based on fluid Petri nets to predict the execution time of MapReduce jobs. The experiments we performed at CINECA, the Italian supercomputing center, show that the achieved accuracy is within 16% of the actual measurements on average.
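The core idea of fluid approximation, replacing many discrete tasks with a continuous quantity of work drained at a rate, can be sketched with a tiny Euler integration. This is a deliberately crude stand-in with hypothetical numbers: two fluid places in strict series, whereas the paper's fluid Petri nets capture phase overlap and resource contention.

```python
def fluid_mr_time(map_work, red_work, map_rate, red_rate, dt=0.01):
    """Euler integration of two fluid places in series: the reduce place
    starts draining only once the map place is empty (a simplification)."""
    t, m, r = 0.0, map_work, red_work
    while m > 0 or r > 0:
        if m > 0:
            m -= map_rate * dt   # drain map fluid at its service rate
        else:
            r -= red_rate * dt   # then drain reduce fluid
        t += dt
    return t

# Hypothetical job: 100 units of map work at rate 4/s, 60 units of reduce work at 3/s
print(fluid_mr_time(100.0, 60.0, 4.0, 3.0))  # ~45.0 s (25 s map + 20 s reduce)
```

For constant rates this reduces to work/rate per phase; the fluid formulation pays off when rates depend on the marking (e.g., the number of busy containers over time), which a closed form no longer captures.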